SJTU at TREC 2004: Web Track Experiments

Authors

Yiming Lu

Jian Hu

Fanyuan Ma

Abstract

Yiming Lu, Jian Hu, Fanyuan Ma (Department of Computer Science & Engineering, Shanghai Jiaotong University, Shanghai 200030) {luyiniao, hujian, ma-fy}@sjtu.edu.cn

This is the first year our lab has participated in TREC. We took part in the mixed-query task of the Web track. All the runs we submitted are based on a modified Okapi weighting scheme. In addition, we used several heuristics for re-ranking, including site merging and minimal span weight. The PageRank of a document is also combined with the document's similarity to the query to obtain an overall ranking of documents. For the mixed-query task in particular, we tried a simple classification method to estimate whether a query is a topic distillation query or an entry-page finding query.
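The abstract does not say how the Okapi similarity and PageRank are combined, so the following Python sketch is only an illustration under stated assumptions. It uses a common BM25 term weight (not the authors' modified scheme, which is unspecified here) and a linear interpolation of min-max normalized similarity and log PageRank. The function names, the interpolation weight `alpha`, and the parameters `k1` and `b` are all hypothetical choices, not values from the paper.

```python
import math


def bm25_term_weight(tf, df, num_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Okapi BM25 weight of one query term in one document.

    Uses a non-negative idf variant; the paper's "modified Okapi"
    scheme is not described in the abstract, so this is an assumption.
    """
    idf = math.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + length_norm)


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine(similarity, pagerank, alpha=0.8):
    """Linear combination of query similarity and log PageRank.

    alpha is a hypothetical interpolation weight. PageRank values
    must be strictly positive because the logarithm is taken.
    """
    sim_n = min_max(similarity)
    pr_n = min_max([math.log(p) for p in pagerank])
    return [alpha * s + (1.0 - alpha) * p for s, p in zip(sim_n, pr_n)]


def rank(doc_ids, similarity, pagerank, alpha=0.8):
    """Return doc_ids ordered by descending combined score."""
    scores = combine(similarity, pagerank, alpha)
    order = sorted(range(len(doc_ids)), key=lambda i: scores[i], reverse=True)
    return [doc_ids[i] for i in order]
```

With equal similarity scores, the document with the higher PageRank ranks first; with equal PageRank, the ordering follows similarity alone. In practice `alpha` would need tuning per query type, which is one reason a query classifier such as the one mentioned in the abstract could be useful.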


Related resources

Novel Approaches in Text Information Retrieval - Experiments in the Web Track of TREC 2004

In this paper, we report our experiments in the mixed query task of the Web track for TREC 2004. We deal with the problem of ranking Web documents within a multicriteria framework and propose a novel approach for information retrieval. We focus on the design of a set of criteria aiming at capturing complementary aspects of relevance. Moreover, we provide aggregation procedures that are based on...


Overview of the TREC 2004 Web Track

This year’s main experiment involved processing a mixed query stream, with an even mix of each query type studied in TREC-2003: 75 homepage finding queries, 75 named page finding queries and 75 topic distillation queries. The goal was to find ranking approaches which work well over the 225 queries, without access to query type labels. We also ran two small experiments. First, participants were ...


TREC 2004 Web Track Experiments at CAS-ICT

This report presents CAS-ICT’s experiments on the Mixed query task of the TREC2004 Web track. Our work focused on combining different Web page evidences together to improve the overall retrieval performance. Four kinds of evidences, including body content(C), anchor texts (AT), basic structural information (S0) and extended structural information (S1) were considered for retrieval. Six combinat...


RMIT University at TREC 2004

RMIT University participated in two tracks at TREC 2004: Terabyte and Genomics, both for the first time. This paper describes the techniques we applied and our experiments in both tracks, and discusses the results of the genomics track runs; the terabyte track results are unavailable at the time of manuscript submission. We also describe our new zettair search engine, in use for the first time ...


Indri at TREC 2004: Terabyte Track

This paper provides an overview of experiments carried out at the TREC 2004 Terabyte Track using the Indri search engine. Indri is an efficient, effective distributed search engine. Like INQUERY, it is based on the inference network framework and supports structured queries, but unlike INQUERY, it uses language modeling probabilities within the network which allows for added flexibility. We des...




Publication date: 2004
